GNU Info File | 1995-09-01 | 48.5 KB | 1,197 lines

This is Info file ../info/w3.info, produced by Makeinfo-1.63 from the input file w3.texi.

This file documents the Emacs-w3 World Wide Web browser.

Copyright (C) 1993, 1994, 1995 William M. Perry

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

File: w3.info, Node: Mailcap File, Prev: Specifying Viewers, Up: MIME Support

Mailcap File
============

NCSA Mosaic and almost all other WWW browsers rely on a separate file for mapping MIME types to external viewing programs. This takes some of the burden off of browser developers, so each browser does not have to support all image formats, or postscript, etc. Instead of having the users of Emacs-w3 duplicate this in lisp, this file can be parsed using the `mm-parse-mailcaps' function. This function is called each time w3 is loaded. It tries to locate mailcap files in several places. If the environment variable `MAILCAPS' is nonempty, then this is assumed to specify a UNIX-like path of mailcap files (this is a colon separated string of pathnames). If the `MAILCAPS' environment variable is empty, then Emacs-w3 looks for these files:

  1. `~/.mailcap'
  2. `/etc/mailcap'
  3. `/usr/etc/mailcap'
  4. `/usr/local/etc/mailcap'

The format of this file is specified in RFC 1343, but a brief synopsis follows (this is taken verbatim from sections of RFC 1343).

Each mailcap file consists of a set of entries that describe the proper handling of one media type at the local site. For example, one line might tell how to display a message in Group III fax format. A mailcap file consists of a sequence of such individual entries, separated by newlines (according to the operating system's newline conventions). Blank lines and lines that start with the "#" character (ASCII 35) are considered comments, and are ignored.
Long entries may be continued on multiple lines if each non-terminal line ends with a backslash character ('\', ASCII 92), in which case the multiple lines are to be treated as a single mailcap entry. Note that for such "continued" lines, the backslash must be the last character on the line to be continued.

Each mailcap entry consists of a number of fields, separated by semi-colons. The first two fields are required, and must occur in the specified order. The remaining fields are optional, and may appear in any order.

The first field is the content-type, which indicates the type of data this mailcap entry describes how to handle. It is to be matched against the type/subtype specification in the "Content-Type" header field of an Internet mail message. If the subtype is specified as "*", it is intended to match all subtypes of the named content-type.

The second field, view-command, is a specification of how the message or body part can be viewed at the local site. Although the syntax of this field is fully specified, the semantics of program execution are necessarily somewhat operating system dependent.

The optional fields, which may be given in any order, are as follows:

   * The "compose" field may be used to specify a program that can be used to compose a new body or body part in the given format. Its intended use is to support mail composing agents that support the composition of multiple types of mail using external composing agents. As with the view-command, the semantics of program execution are operating system dependent. The result of the composing program may be data that is not yet suitable for mail transport--that is, a Content-Transfer-Encoding may need to be applied to the data.

   * The "composetyped" field is similar to the "compose" field, but is to be used when the composing program needs to specify the Content-type header field to be applied to the composed data.
     The "compose" field is simpler, and is preferred for use with existing (non-mail-oriented) programs for composing data in a given format. The "composetyped" field is necessary when the Content-type information must include auxiliary parameters, and the composition program must then know enough about mail formats to produce output that includes the mail type information.

   * The "edit" field may be used to specify a program that can be used to edit a body or body part in the given format. In many cases, it may be identical in content to the "compose" field, and shares the operating-system dependent semantics for program execution.

   * The "print" field may be used to specify a program that can be used to print a message or body part in the given format. As with the view-command, the semantics of program execution are operating system dependent.

   * The "test" field may be used to test some external condition (e.g. the machine architecture, or the window system in use) to determine whether or not the mailcap line applies. It specifies a program to be run to test some condition. The semantics of execution and of the value returned by the test program are operating system dependent. If the test fails, a subsequent mailcap entry should be sought. Multiple test fields are not permitted--since a test can call a program, it can already be arbitrarily complex.

   * The "needsterminal" field indicates that the view-command must be run on an interactive terminal. This is needed to inform window-oriented user agents that an interactive terminal is needed. (The decision is not left exclusively to the view-command because in some circumstances it may not be possible for such programs to tell whether or not they are on interactive terminals.) The needsterminal command should be assumed to apply to the compose and edit commands, too, if they exist.
     Note that this is NOT a test--it is a requirement for the environment in which the program will be executed, and should typically cause the creation of a terminal window when not executed on either a real terminal or a terminal window.

   * The "copiousoutput" field indicates that the output from the view-command will be an extended stream of output, and is to be interpreted as advice to the UA (User Agent mail-reading program) that the output should be either paged or made scrollable. Note that it is probably a mistake if needsterminal and copiousoutput are both specified.

   * The "description" field simply provides a textual description, optionally quoted, that describes the type of data, to be used optionally by mail readers that wish to describe the data before offering to display it.

   * The "x11-bitmap" field names a file, in X11 bitmap (xbm) format, which points to an appropriate icon to be used to visually denote the presence of this kind of data.

   * Any other fields beginning with "x-" may be included for local or mailer-specific extensions of this format. Implementations should simply ignore all such unrecognized fields to permit such extensions, some of which might be standardized in a future version of this document.

File: w3.info, Node: Security, Next: Non-Unix Operating Systems, Up: Top

Security
********

* Menu:

* Basic::       The 'Basic' authentication scheme for HTTP/1.0
* Digest::      The 'Digest' authentication scheme for HTTP/1.0
* SSL::         Secure Sockets Layer from Netscape, and how to enable it in Emacs-w3.
* PGP/PEM::     Using PGP/PEM to encrypt information

There are an increasing number of ways to authenticate yourself to a web service. Emacs-w3 tries to support as many as possible.

File: w3.info, Node: Basic, Next: Digest, Prev: Security, Up: Security

HTTP/1.0 Basic Authentication
=============================

The weakest authentication available, not recommended if you are at all serious about security on your web site.
This is simply a string that looks like `user:password' that has been Base64 encoded, as defined in RFC 1421. It is given as an example of how to write an authorization module.

All of the functions for storing, retrieving, and over-writing the cached authorization information should be handled by one function (although it would be perfectly acceptable to have a stub function that passed off to three larger functions based on its parameters).

The most efficient way to store the cached information is by an assoc-list of assoc-lists. The top level assoc list is keyed on the name of the server. The secondary assoc-list is keyed on the full path of the file that is protected. Thus, a sample authorization cache would look like this:

     (("info.cern.ch" .
       (("/foo" . "d21wZXJyeTp0ZXN0aW5n")
        ("/bar" . "amtvbnJhdGg6ZGlzbWVtYmVy")
        ("/foo/x.html" . "dmlvbGV0dDpvcGVuZ2w=")))
      ("cs.indiana.edu" .
       (("/elisp/w3/" . "dGxvb3M6Y29ucXVlcg==")
        ("/" . "bXZhbmhleW46a2lsbGh1bGljaw=="))))

The structure consists of two assoc-lists for the sake of speed. The list of cached information could conceivably hold several thousand links (if the user does not exit Emacs for long periods of time). If the list were keyed on a full URL, the assoc function would have to search through every link before failing to find a new URL. With the current scheme, assoc only has to search through a few items (maximum is the number of HTTP servers, which should always be much, much smaller than the number of distinct URLs). Even with a 3:1 ratio of URLs to each server, this is a big win.

File: w3.info, Node: Digest, Next: SSL, Prev: Basic, Up: Security

HTTP/1.0 Digest Authentication
==============================

Jeffery L. Hostetler, John Franks, Philip Hallam-Baker, Ari Luotonen, Eric W. Sink, and Lawrence C. Stewart have an internet draft for a new authentication mechanism.
For the complete specification, please see draft-ietf-http-digest-aa-01.txt in your nearest internet drafts archive(1). What follows is mainly taken from the March 24, 1995 version of the internet draft.

The protocol referred to as "HTTP/1.0" includes specification for a Basic Access Authentication scheme. This scheme is not considered to be a secure method of user authentication, as the user name and password are passed over the network in an unencrypted form. A specification for a new authentication scheme is needed for future versions of the HTTP protocol. This document provides specification for such a scheme, referred to as "Digest Access Authentication".

The Digest Access Authentication scheme is not intended to be a complete answer to the need for security in the World Wide Web. This scheme provides no encryption of object content. The intent is simply to facilitate secure access authentication.

Like Basic Access Authentication, the Digest scheme is based on a simple challenge-response paradigm. The Digest scheme challenges using a nonce value. A valid response contains the MD5 checksum of the password and the given nonce value. In this way, the password is never sent in the clear. Just as with the Basic scheme, the username and password must be prearranged in some fashion.

If a server receives a request for an access-protected object, and an acceptable Authorization header is not sent, the server responds with:

     HTTP/1.1 401 Unauthorized
     WWW-Authenticate: Digest realm="<realm>",
                              domain="<domain>",
                              nonce="<nonce>",
                              opaque="<opaque>",
                              stale="<TRUE | FALSE>"

The meanings of the identifiers used above are as follows:

`<realm>'
     A name given to users so they know which username and password to send.

`<domain> OPTIONAL'
     A comma separated list of URIs, as specified for HTTP/1.0. The intent is that the client could use this information to know the set of URIs for which the same authentication information should be sent.
     The URIs in this list may exist on different servers. If this keyword is omitted or empty, the client should assume that the domain consists of all URIs on the responding server.

`<nonce>'
     A server-specified integer value which may be uniquely generated each time a 401 response is made. Servers may defend themselves against replay attacks by refusing to reuse nonce values. The nonce should be considered opaque by the client.

`<opaque> OPTIONAL'
     A string of data, specified by the server, which should be returned by the client unchanged. It is recommended that this string be base64 or hexadecimal data. Specifically, since the string is passed in the header lines as a quoted string, the double-quote character is not allowed.

`<stale> OPTIONAL'
     A flag, indicating that the previous request from the client was rejected because the nonce value was stale. If stale is TRUE, the client may wish to simply retry the request with a new encrypted response, without reprompting the user for a new username and password.

The client is expected to retry the request, passing an Authorization header line as follows:

     Authorization: Digest
          username="<username>",        -- required
          realm="<realm>",              -- required
          nonce="<nonce>",              -- required
          uri="<requested-uri>",        -- required
          response="<digest>",          -- required
          message="<message-digest>",   -- OPTIONAL
          opaque="<opaque>"             -- required if provided by server

where

     <digest> := H( H(A1) + ":" + N + ":" + H(A2) )

and

     <message-digest> := H( H(A1) + ":" + N + ":" + H(<message-body>) )

where:

     A1 := U + ':' + R + ':' + P
     A2 := <Method> + ':' + <requested-uri>

with:

     N               -- nonce value
     U               -- username
     R               -- realm
     P               -- password
     <Method>        -- from header line 0
     <requested-uri> -- uri sans proxy/routing

Where H() is the RSA Data Security, Inc. MD5 Message-Digest Algorithm (2).

Upon receiving the Authorization information, the server may check its validity by looking up its known password which corresponds to the submitted <username>.
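
The digest computation defined above can be sketched in a few lines of Python. This is only an illustration of the formulas, not code from Emacs-w3: the function names are hypothetical, the user, realm, and nonce in the usage note are made up, and rendering the output of H() as lowercase hexadecimal is an assumption.

```python
import hashlib
import hmac


def H(text):
    """MD5 of an ASCII string, rendered as lowercase hex (the hex form is an assumption)."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def digest_response(username, realm, password, nonce, method, uri):
    """Compute <digest> := H( H(A1) + ":" + N + ":" + H(A2) )."""
    a1 = username + ":" + realm + ":" + password   # A1 := U:R:P
    a2 = method + ":" + uri                        # A2 := <Method>:<requested-uri>
    return H(H(a1) + ":" + nonce + ":" + H(a2))


def authorization_header(username, realm, nonce, uri, response, opaque=None):
    """Format the Authorization line; opaque is echoed only if the server sent one."""
    fields = [
        ("username", username),
        ("realm", realm),
        ("nonce", nonce),
        ("uri", uri),
        ("response", response),
    ]
    if opaque is not None:
        fields.append(("opaque", opaque))
    return "Authorization: Digest " + ", ".join('%s="%s"' % f for f in fields)


def server_accepts(stored_h_a1, nonce, method, uri, response):
    """Server-side check that needs only the stored H(A1), never the clear password."""
    expected = H(stored_h_a1 + ":" + nonce + ":" + H(method + ":" + uri))
    return hmac.compare_digest(expected, response)
```

For instance, a client answering a challenge with nonce "12345" for realm "docs" would call `digest_response("wmperry", "docs", "testing", "12345", "GET", "/docs/protected.html")` and pass the result to `authorization_header`. A server that stored only `H("wmperry:docs:testing")` can confirm the response with `server_accepts`.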
The server must then perform the same MD5 operation performed by the client, and compare the result to the given <response>.

Note that the HTTP server does not actually need to know the user's clear text password. As long as H(A1) is available to the server, the validity of an Authorization header may be verified.

All keyword-value pairs must be expressed in characters from the US-ASCII character set, excluding control characters.

---------- Footnotes ----------

(1) One is ftp://ds.internic.net/internet-drafts

(2) RFC 1321. R.Rivest, "The MD5 Message-Digest Algorithm", http://ds.internic.net/rfc/rfc1321.txt, April 1992.

File: w3.info, Node: SSL, Next: PGP/PEM, Prev: Digest, Up: Security

SSL
===

SSL is the `Secure Sockets Layer' interface developed by Netscape Communications (1). In order to use SSL in Emacs-w3, you will need one of the reference implementations of SSL that are publicly available. These are the implementations that I am aware of:

`SSLRef 2.0'
     Available from Netscape Communications at http://www.netscape.com/newsref/std/sslref.html. This requires the RSARef library, which is not exportable. The RSARef library is available from ftp://ftp.rsa.com/rsaref/

`SSLeay 0.4'
     An implementation by Eric Young (eay@mincom.oz.au) that is free for commercial or noncommercial use, and was developed completely outside the US by a non-US citizen. More information can be found at ftp://ftp.psy.uq.oz.au/pub/Crypto/SSL/

Whichever reference implementation you choose to download (I recommend the SSLeay distribution, just to thumb a nose at the NSA :), you must have a program you can run in a subprocess that takes a hostname and port number on the command line, and reads/writes to standard input/output (the Netscape implementation comes with one of these by default). Once you have this program, set the variable `ssl-program-name' to point to the executable. This should be all you need to do.
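
To make the contract for this helper program concrete, here is a minimal sketch of one written in Python. It is a stand-in, not the program shipped with SSLRef or SSLeay: it uses the TLS support in Python's standard `ssl` module, and the function names `parse_args`, `relay`, and `main` are hypothetical.

```python
import socket
import ssl
import sys
import threading


def parse_args(argv):
    """Return (host, port) from an argument vector of the form [PROG, HOST, PORT]."""
    if len(argv) != 3:
        raise ValueError("usage: PROG HOST PORT")
    return argv[1], int(argv[2])


def relay(host, port):
    """Connect to host:port over TLS and shuttle bytes between it and stdio."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port)) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:

            def pump_stdin():
                # Forward everything arriving on standard input to the server.
                while True:
                    data = sys.stdin.buffer.read1(4096)
                    if not data:
                        break
                    tls.sendall(data)

            threading.Thread(target=pump_stdin, daemon=True).start()

            # Copy the server's reply to standard output until it closes.
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                sys.stdout.buffer.write(data)
                sys.stdout.buffer.flush()


def main(argv):
    host, port = parse_args(argv)
    relay(host, port)
```

A script built from this sketch would finish by calling `main(sys.argv)`, and an executable wrapper around it is the kind of program `ssl-program-name' is meant to name. The stdin thread is a daemon so that the process exits as soon as the server closes the connection.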
In the future, I will be distributing a set of patches to Emacs 19.xx and XEmacs 19.xx to SSL-enable them, for the sake of speed.

NOTE: This implementation does not support the use of client certificates, but then nobody else supports that area of the protocol either, so I'm not too worried about it.

---------- Footnotes ----------

(1) http://www.netscape.com/

File: w3.info, Node: PGP/PEM, Prev: SSL, Up: Security

PGP/PEM
=======

Most of this section was taken from the documentation written by Rob McCool (robm@ncsa.uiuc.edu), and is gratefully reproduced here with his permission (1).

RIPEM is 'Riordan's Internet Privacy Enhanced Mail', and is currently on version 1.2b3. US citizens can ftp it from ripem.msu.edu:/pub/crypt/ripem.

PGP is 'Pretty Good Privacy', and is currently on version 2.6. The legal controversies that plagued earlier versions have been resolved, so this is a completely legal program now. There is also a legal version for European users, called 2.6ui (the Unofficial International version).

PGP and PEM are programs that allow you and a second party to communicate in a way which does not allow third parties to read your messages, and which certifies that the person who sent a message is really who they claim to be. PGP and PEM both use RSA encryption. The U.S. government has strict export controls over foreign use of this technology, so people outside the U.S. may have a difficult time finding programs which perform the encryption.

You will need a working copy of either Pretty Good Privacy or RIPEM to begin with. You should be familiar with the program and have generated your own public/private key pair. You should be able to use the TIS/PEM program with the PEM authorization type. I haven't tried it. This tutorial is written assuming that you are using RIPEM.

Currently, the protocol has been implemented with PEM and PGP using local key files on the server side, and on the client side with PEM using finger to retrieve the server's public key.
As you can tell, parties who wish to use Emacs-w3 and httpd with PEM or PGP encryption will need to communicate beforehand and find a tamper-proof way to exchange their public keys.

Pioneers get shot full of arrows. This work is currently in the experimental stages and thus may have some problems that I have overlooked. The only problem that I know about is that the messages are currently not timestamped. This means that a malicious user could record your encrypted message with a packet sniffer and repeat it back to the server ad nauseam. Although they would not be able to read the reply, if the request was something you were being charged for, you may have a large bill to pay by the time they're through.

This protocol is almost word-for-word a copy of Tony Sanders' RIPEM based scheme, generalized a little. Below, wherever you see PEM you can replace it with PGP and get the same thing.

*Client:*
     GET /docs/protected.html HTTP/1.0
     UserAgent: Emacs-W3/2.1.x

*Server:*
     HTTP/1.0 401 Unauthorized
     WWW-Authenticate: PEM entity="webmaster@hoohoo.ncsa.uiuc.edu"
     Server: NCSA/1.1

*Client:*
     GET / HTTP/1.0
     Authorization: PEM entity="robm@ncsa.uiuc.edu"
     Content-type: application/x-www-pem-request

     --- BEGIN PRIVACY-ENHANCED MESSAGE ---
     this is the real request, encrypted
     --- END PRIVACY-ENHANCED MESSAGE ---

*Server:*
     HTTP/1.0 200 OK
     Content-type: application/x-www-pem-reply

     --- BEGIN PRIVACY-ENHANCED MESSAGE ---
     this is the real reply, encrypted
     --- END PRIVACY-ENHANCED MESSAGE ---

That's it.

Emacs-w3 uses the excellent mailcrypt package written by Jin S Choi (jsc@mit.edu) (2). This package takes care of calling ripem and/or pgp with the correct arguments. Please see the documentation at the top of mailcrypt.el for instructions on using mailcrypt. All bug reports about mailcrypt should go to Jin S Choi, but bugs about how I use it in Emacs-w3 should of course be directed to me.
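
The client's second request in the exchange above, where the real request travels encrypted inside an outer `GET /', can be sketched as follows. This is an illustration of the message layout only: `wrap_request` and `armor` are hypothetical names, `encrypt` stands for whatever the caller uses to run ripem or pgp, and the newline conventions and the blank line between headers and body are assumptions.

```python
BEGIN = "--- BEGIN PRIVACY-ENHANCED MESSAGE ---"
END = "--- END PRIVACY-ENHANCED MESSAGE ---"


def armor(ciphertext):
    """Wrap already-encrypted text in the delimiters shown in the exchange."""
    return BEGIN + "\n" + ciphertext + "\n" + END + "\n"


def wrap_request(inner_request, client_entity, encrypt):
    """Build the outer request that carries the encrypted inner request.

    encrypt is a caller-supplied function (hypothetical here) that returns
    the ciphertext for inner_request, for example by running ripem.
    """
    headers = [
        "GET / HTTP/1.0",
        'Authorization: PEM entity="%s"' % client_entity,
        "Content-type: application/x-www-pem-request",
    ]
    # A blank line separates the outer headers from the enveloped body.
    return "\n".join(headers) + "\n\n" + armor(encrypt(inner_request))
```

The entity named in the server's 401 challenge identifies the key to encrypt to, while `client_entity` identifies the sender. Nothing in the wrapped message changes from one send to the next, which is why a recorded copy can be replayed as described above.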
---------- Footnotes ----------

(1) See http://hoohoo.ncsa.uiuc.edu/docs/PEMPGP.html

(2) Available via anonymous ftp to archive.cis.ohio-state.edu in /pub/gnu/emacs/elisp-archive/interfaces/mailcrypt.el.Z

File: w3.info, Node: Non-Unix Operating Systems, Next: VMS, Prev: Security, Up: Top

Non-Unix Operating Systems
**************************

* Menu:

* VMS::                 The wonderful world of VAX|AXP-VMS!
* OS/2::                The next-best thing to Unix.
* MS-DOS::              The wonderful world of MS-DOG!
* 16-Bit Windows::      Windows 3.1, 3.11, and WFW 3.11.
* 32-Bit Windows::      Windows NT, Chicago/Windows 95.
* Macintosh::           The wonderful world of Macintrash!
* Amiga::               The Amiga, for those who still love them.

File: w3.info, Node: VMS, Next: OS/2, Prev: Non-Unix Operating Systems, Up: Non-Unix Operating Systems

VMS
===

:: WORK ::

File: w3.info, Node: OS/2, Next: MS-DOS, Prev: VMS, Up: Non-Unix Operating Systems

OS/2
====

:: WORK ::

File: w3.info, Node: MS-DOS, Next: 16-Bit Windows, Prev: OS/2, Up: Non-Unix Operating Systems

MS-DOS
======

:: WORK ::

File: w3.info, Node: 16-Bit Windows, Next: 32-Bit Windows, Prev: MS-DOS, Up: Non-Unix Operating Systems

16-Bit Windows
==============

:: WORK ::

File: w3.info, Node: 32-Bit Windows, Next: Macintosh, Prev: 16-Bit Windows, Up: Non-Unix Operating Systems

32-Bit Windows
==============

:: WORK ::

File: w3.info, Node: Macintosh, Next: Amiga, Prev: 32-Bit Windows, Up: Non-Unix Operating Systems

Macintosh
=========

:: WORK ::

File: w3.info, Node: Amiga, Next: Advanced Features, Prev: Macintosh, Up: Non-Unix Operating Systems

Amiga
=====

:: WORK ::

File: w3.info, Node: Advanced Features, Next: Style Sheets, Prev: Amiga, Up: Top

Advanced Features
*****************

* Menu:

* Style Sheets::                Formatting control, the right way
* Disk Caching::                Speeding performance by using a local disk cache
* Searching::                   How to search entire sections of the web
* Interfacing to Mail/News::    How to make VM understand hypertext links
* Debugging HTML::              How to make Emacs-w3 display warnings
                                about invalid HTML/HTML+ constructs.
* Native WAIS Support::         How to make Emacs-w3 understand WAIS links without using a gateway.
* Rating Links::                How to make Emacs-w3 put an 'interestingness' value next to each link.
* Gopher Plus Support::         How Emacs-w3 makes use of the Gopher+ protocol.
* Hooks::                       Various hooks to use throughout Emacs-w3
* Other Variables::             Miscellaneous variables that control the real guts of Emacs-w3.

File: w3.info, Node: Style Sheets, Next: Disk Caching, Prev: Advanced Features, Up: Advanced Features

Style Sheets
============

Emacs-w3 currently supports the experimental style sheet mechanism proposed by Håkon W. Lie of the W3 Consortium. This allows the author to specify what a document should look like, and yet allows the end user to override any of the stylistic changes. This allows people with special needs (most notably the visually impaired) to override style bindings that could make a document totally unreadable.

A stylesheet consists of comments and directives. A comment is any line starting with a #, and is terminated by the end of the line. A directive includes the tag name, an attribute name, and a value. A sample stylesheet is:

     <style notation="experimental">
     # This line is a comment
     # These will be ignored, up to the terminating end-of-line
     #
     h1: align=center
     h1: color.text=yellow
     h1: color.background=red
     h1: font.size *= 2
     </style>

Below is a comprehensive list of the attribute names.

`color.text'
     Specifies the foreground color of the text for this item.

`color.background'
     Specifies the background color of the text for this item.

`background.bitmap'
     Specifies a bitmap to be used as the background for this item.

`font.size'
     Specifies the font size. This can be specified with the +=, -=, /=, or *= operator, signifying a change from the default font size. For example, font.size *= 2 would mean a font twice as large as the default font.

`font.style'
     Specifies the font style.
     This controls whether a font is bold, italic, underlined, or any combination of these. The value can be a comma or ampersand (&) separated list of values.

`font.family'
     Specifies the font family - this is the basic type of font. Note that not all font families will be available on all platforms, or even the same platform in a slightly different configuration. If the specified font family cannot be found on the machine, the default font is used instead.

`align'
     Specifies how the text contained within the item is to be aligned. Possible values are left, right, justify, center, or indent.

`width'
     Specifies how wide the item should be. This is only used for horizontal rule (<HR>) tags right now.

To include a stylesheet in your document, simply use the <style> tag. You can use the notation attribute to specify what language the stylesheet is specified in. The default is experimental. The data between the <style> and </style> tags is the stylesheet proper - no HTML parsing is done to this data - it is treated similarly to an <XMP> section of text.

To reference an external stylesheet, you should use the <link> tag.

     <link rel="stylesheet" href="/bill.style">

If these two mechanisms are mixed, then the URL is resolved first, and the contents of the <style> tag take precedence if there are any conflicting directives.

In the future, DSSSL and DSSSL-lite will be supported as valid stylesheet languages, but not in this release.

File: w3.info, Node: Disk Caching, Next: Searching, Prev: Style Sheets, Up: Advanced Features

Disk Caching
============

A cache stores the information on a page on your local machine. When requesting a page that is in the cache, Emacs-w3 can retrieve the page from the cache more quickly than retrieving the page again from its location out on the network. With a well-populated cache, the speed of browsing the web is dramatically increased.

The first time a page is requested, Emacs-w3 retrieves the page from the network.
When requesting a page that is in the cache, Emacs-w3 checks to see if the page has changed since it was last retrieved from the remote machine. If it has not changed, the local copy is used, saving the transmission of the file over the network.

To turn on disk caching, set the variable `url-automatic-caching' to non-`nil', or choose the 'Caching' menu item (under `Options'). That is all there is to it.

It is recommended that you use the `clean-cache' shell script first, to allow for future cleaning of the cache. This shell script will remove all files that have not been accessed since it was last run. To keep the cache pared down, it is recommended that this script be run from at or cron (see the manual pages for crontab(5) or at(1) for more information).

A large cache of documents on the local disk can be very handy when traveling, or any other time the network connection is not active (a laptop with a dial-on-demand PPP connection, etc). Emacs-w3 can rely solely on its cache, and avoid checking to see if the page has changed on the remote server. In the case of a dial-on-demand PPP connection, this will keep the phone line free as long as possible, only bringing up the PPP connection when asking for a page that is not located in the cache. This is very useful for demonstrations as well. To turn this feature on, set the variable `url-standalone-mode' to non-`nil', or choose the `Use Cache Only' menu item (under `Options').

Emacs-w3 caches files under the temporary directory specified by `url-temporary-directory', in a user-specific subdirectory (determined by the `user-real-login-name' function). The cache files are stored under their original names, so a URL like: http://www.spry.com/foo/bar/baz.html would be stored in a cache file named: /tmp/wmperry/com/spry/www/foo/bar/baz.html.

Sometimes, especially with gopher links, there will be name conflicts, and an error will be signalled.
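
The readable naming scheme can be sketched in Python as follows. The mapping is reconstructed from the single example above (reversed host name components, then the URL path), so treat it as an approximation: `cached_filename`, its default arguments, and the gopher host in the usage note are hypothetical, and the explanation of how a conflict arises is an assumption rather than something the manual states.

```python
import posixpath
from urllib.parse import urlsplit


def cached_filename(url, cache_root="/tmp", user="wmperry"):
    """Map a URL to a cache path: root/user/reversed-host-parts/url-path."""
    parts = urlsplit(url)
    host_dirs = list(reversed(parts.hostname.split(".")))
    path_dirs = [p for p in parts.path.split("/") if p]
    return posixpath.join(cache_root, user, *host_dirs, *path_dirs)
```

With these defaults, `cached_filename("http://www.spry.com/foo/bar/baz.html")` gives /tmp/wmperry/com/spry/www/foo/bar/baz.html. One plausible source of conflicts: the selectors gopher://gopher.example.org/1/docs and gopher://gopher.example.org/1/docs/readme map to paths where `docs' would have to be both a file and a directory.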
Such conflicts cannot be avoided while still keeping reasonable performance at startup time (reading in an index file of all the cached pages can take a long time on slow machines, or even fast machines with large caches).

If you are running XEmacs 19.12 or later, you can use an alternate naming scheme that avoids name conflicts, but loses the human readability of the cache file names. The cache files will look like: /tmp/wmperry/acbd18db4cc2f85cedef654fccc4a4d8, which is certainly unique, but not very user-friendly. To turn this on, add this to your `.emacs' file:

     (add-hook 'w3-load-hooks
               '(lambda ()
                  (fset 'url-create-cached-filename
                        'url-create-cached-filename-using-md5)))

If you will not be using other emacs variants, I highly recommend this method of creating the cache filename.

File: w3.info, Node: Searching, Next: Interfacing to Mail/News, Prev: Disk Caching, Up: Advanced Features

Searching
=========

In the file `w3-search.el' is a function that some may find handy. It is not 100% completed yet, so if you run into any problems with it, please try to fix it, not just say it's broken.

The function is `w3-do-search'. It must be called with at least one argument. All others are optional. Arguments are TERM, BASE, HOPS-LIMIT, and RESTRICTION. This recursively descends all the child links of the current document, searching for TERM.

TERM may be a string, in which case it is treated as a regular expression, and `re-search-forward' is used, or a symbol, in which case it is funcalled with 1 argument, the current URL being searched.

BASE is the URL to start searching from.

HOPS-LIMIT is the maximum number of nodes to descend before the search dies out.

RESTRICTION is a regular expression or function to call with one argument, a URL that could be searched. If RESTRICTION returns non-`nil', then the URL is added to the queue, otherwise it is discarded.
This is useful for restricting searching to either certain types of URLs (only search ftp links), or restricting searching to one domain (only search stuff in the indiana.edu domain).

You may check several variables from the main `w3-do-search' routine in any functions passed to it (as RESTRICTION or TERM). QUEUE is the queue of links to be searched, HOPS is the current number of hops from the root document, RESULTS is an assoc list of (URL . RETVAL), where RETVAL is the value returned from previous calls to the TERM function (or point if searching for a regular expression).

The function returns a list of the form:

     ((URL . RETVAL)...)

Please note that there is no interactive use for this function yet--it was designed for non-interactive, batch-mode processing. However, if anyone wants to write a wrapper function for it, please feel free.

File: w3.info, Node: Interfacing to Mail/News, Next: Debugging HTML, Prev: Searching, Up: Advanced Features

Interfacing to Mail/News
========================

More and more people are including URLs in their signatures, and within the body of mail messages. It can get quite tedious to type these into the minibuffer to follow one.

To access URLs with VM, the following in your `~/.emacs' or `~/.vm' files should do the trick. It adds two keybindings to the main VM message window. The middle mouse button now tries to follow a hypertext link.

     (add-hook 'vm-mode-hook
               (function
                (lambda ()
                  (define-key vm-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
                  (define-key vm-mode-map "\r" 'w3-maybe-follow-link))))

To access URLs with RMAIL, the following in your `~/.emacs' file should do the trick.

     (add-hook 'rmail-mode-hook
               (function
                (lambda ()
                  (define-key rmail-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
                  (define-key rmail-mode-map "\r" 'w3-maybe-follow-link))))

To access URLs with GNUS, the following in your `~/.emacs' file should do the trick.
     (add-hook 'gnus-article-mode-hook
               (function
                (lambda ()
                  (define-key gnus-article-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
                  (define-key gnus-article-mode-map "\r" 'w3-maybe-follow-link))))

NOTE: XEmacs 19.12 has special versions of VM and GNUS that highlight URLs automatically. All that is required to follow one of these links is clicking the middle mouse button on the highlighted text.

File: w3.info, Node: Debugging HTML, Next: Native WAIS Support, Prev: Interfacing to Mail/News, Up: Advanced Features

Debugging HTML
==============

If you are feeling adventurous, or are just as anal as I am about people writing valid HTML, you can set the variable `w3-debug-html' to `t' and see what happens.

If Emacs-w3 thinks it has encountered invalid HTML, then a debugging message is logged to the buffer specified by `w3-debug-buffer'. This can be a buffer object, or the name of a buffer.

NOTE: This has not yet been reintegrated into the new display engine and parser.

File: w3.info, Node: Native WAIS Support, Next: Rating Links, Prev: Debugging HTML, Up: Advanced Features

Native WAIS Support
===================

This version of Emacs-W3 supports native WAIS querying (earlier versions required the use of a gateway program). In order to use the native WAIS support, a working "waisq" binary is required. I recommend the distribution from think.com - ftp://think.com/wais/wais-8-b6.1.tar.Z is a good place to start.

The variable `url-waisq-prog' must point to this executable, and one of `url-wais-gateway-server' or `url-wais-gateway-port' should be `nil'.

When a WAIS URL is encountered, a form will be automatically generated and displayed. After typing in your search term, the query will be sent to the server by running the `url-waisq-prog' in a subprocess. The results will be converted into HTML and displayed.

File: w3.info, Node: Rating Links, Next: Gopher Plus Support, Prev: Native WAIS Support, Up: Advanced Features

Rating Links
============

The `w3-link-delimiter-info' variable can be used to 'rate' a URL when
it shows up in an HTML page.  If non-`nil', then this should be a
function, given either as a list (a lambda expression) or as a symbol
naming the function.  This function should expect one argument, a
fully specified URL, and should return a string.  This string is
inserted after the link text.

If a user has decided that all links served from blort.com are too
laden with images, and wants to be warned that a link points at this
host, they could do something like this:

     (defun check-url (url)
       ;; Match any host name that ends in blort.com.
       (if (string-match "://[^/]*blort\\.com" url)
           "[SLOW!]"
         ""))
     (setq w3-link-delimiter-info 'check-url)

With this in place, all links pointing to any site at blort.com show
up as "Some link[SLOW!]" instead of just "Some link".

File: w3.info, Node: Gopher Plus Support, Next: Hooks, Prev: Rating Links, Up: Advanced Features

Gopher+ Support
===============

The gopher+ support in Emacs-w3 is limited to the conversion of ASK
blocks into HTML 3.0 forms, and the usage of the content-length given
by the gopher+ server to give a nice status bar on the bottom of the
screen.

This will hopefully be extended to include the Gopher+ method of
content-type negotiation, but this may be a while.

File: w3.info, Node: Hooks, Next: Other Variables, Prev: Gopher Plus Support, Up: Advanced Features

Hooks
=====

These are the various hooks that can be used to customize some of
Emacs-w3's behavior.  They are arranged in the order in which they
would happen when retrieving a document.  All of these are functions
(or lists of functions) that are called consecutively.

`w3-load-hooks'
     These hooks are run by `w3-do-setup' the first time a URL is
     fetched.  All the w3 variables are initialized before this hook
     is run.

`w3-file-done-hooks'
     These hooks are run by `w3-prepare-buffer' after all parsing on a
     document has been done.
     All `url-current-*' and `w3-current-*' variables are initialized
     when this hook is run.  This is run before the buffer is shown,
     and before any inlined images are downloaded and converted.

`w3-file-prepare-hooks'
     These hooks are run by `w3-prepare-buffer' before any parsing is
     done on the HTML file.  The HTTP/1.0 headers specified by
     `w3-show-headers' have been inserted, the syntax table has been
     set to `w3-parse-args-syntax-table', and any personal annotations
     have been inserted by the time this hook is run.

`w3-mode-hooks'
     These hooks are run after a buffer has been parsed and displayed,
     but before any inlined images are downloaded and converted.

`w3-source-file-hooks'
     These hooks are run after displaying a document's source.

File: w3.info, Node: Other Variables, Prev: Hooks, Up: Advanced Features

Miscellaneous variables
=======================

There are lots of variables that control the real nitty-gritty of
Emacs-w3 that the beginning user probably shouldn't mess with.  Here
they are.

`w3-icon-directory-list'
     A list of directories to look in for the w3 standard icons.  Each
     entry must end in a `/'.  If the directory `data-directory'/w3
     exists, then this is automatically added to the default value of
     http://cs.indiana.edu/elisp/w3/icons/.

`w3-keep-old-buffers'
     Whether to keep old buffers around when following links.  If you
     do not like having lots of buffers in one Emacs session, you
     should set this to `nil'.  I recommend setting it to `t', so that
     backtracking from one link to another is faster.

`url-passwd-entry-func'
     This is a symbol indicating which function to call to read in a
     password.  If it is `nil' at startup, it is set up depending on
     whether you are running "EFS" or "ange-ftp".  This function
     should accept the prompt string as its first argument, and the
     default value as its second argument.

`w3-reuse-buffers'
     Determines what happens when `w3-fetch' is called on a document
     that has already been loaded into another buffer.  Possible
     values are: `nil', `yes', and `no'.
     `nil' will ask the user if Emacs-w3 should reuse the buffer (this
     is the default value).  A value of `yes' means assume the user
     always wants to reuse the buffer.  A value of `no' means assume
     the user always wants to re-fetch the document.

`w3-show-headers'
     This is a list of HTTP/1.0 headers to show at the end of a
     buffer.  All the headers should be in lowercase.  They are
     inserted at the end of the buffer in a <UL> list.  Alternatively,
     if this is simply `t', then all the HTTP/1.0 headers are shown.
     The default value is `nil'.

`w3-show-status', `url-show-status'
     Whether to show progress messages in the minibuffer.
     `w3-show-status' controls whether messages about the parsing are
     displayed, and `url-show-status' controls whether a running total
     of the number of bytes transferred is displayed.  These can cause
     a large performance hit if using a remote X display over a slow
     link, or a terminal with a slow modem.

`mm-content-transfer-encodings'
     An assoc list of CONTENT-TRANSFER-ENCODINGS or CONTENT-ENCODINGS
     and the appropriate decoding algorithms for each.  If the `cdr'
     of a node is a list, then this specifies that the decoder is an
     external program, with the program as the first item in the list,
     and the rest of the list specifying arguments to be passed on the
     command line.  If using an external decoder, it must accept its
     input from `stdin' and send its output to `stdout'.

     If the `cdr' of a node is a symbol whose function definition is
     non-`nil', then that encoding can be handled internally.  The
     function is called with 2 arguments, buffer positions bounding
     the region to be decoded.  The function should completely replace
     that region with the unencoded information.

     Currently supported transfer encodings are: base64, x-gzip, 7bit,
     8bit, binary, x-compress, x-hqx, and quoted-printable.

`url-uncompressor-alist'
     An assoc list of file extensions and the appropriate
     uncompression programs for each.  This is used to build the
     Accept-encoding header for HTTP/1.0 requests.

`url-waisq-prog'
     Name of the waisq executable on this system.  This should be the
     `waisq' program from think.com's wais8-b5.1 distribution.

File: w3.info, Node: More Help, Next: Future Directions, Up: Top

More Help
*********

If you need more help on Emacs-w3, please send me mail
(wmperry@spry.com).  Several discussion lists have also been created
for Emacs-w3.  To subscribe, send mail to majordomo@indiana.edu, with
the body of the message 'subscribe LISTNAME <YOUR EMAIL ADDRESS>'.
All other mail should go to <listname>@indiana.edu.

   * w3-announce - this list is for anyone interested in Emacs-w3, and
     should in general only be used by me.  The gnu.emacs.sources
     newsgroup and a few other mailing lists are included on this.
     You may use this if you have written an enhancement to Emacs-w3
     that you wish more people to know about.  (www-announce@w3.org is
     included on this list).

   * w3-beta - this list is for beta testers of Emacs-w3.  These brave
     souls test out not-quite-stable code.

   * w3-dev - a list consisting of myself and a few other people who
     are interested in the internals of Emacs-w3, and doing active
     development work.  Pretty dead right now, but I hope it will
     grow.

If you need more help on the World Wide Web in general, please refer
to the newsgroup comp.infosystems.www.  There are also several
discussion lists concerning the Web.  Send mail to listserv@w3.org
with a subject line of 'subscribe <listname>'.  All mail should go to
<listname>@w3.org.  Administrative mail should go to www-admin@w3.org.
The lists are:

   * www-talk - for general discussion of the World Wide Web, where
     it's going, new features, etc.  All the major developers are
     subscribed to this list.

   * www-announce - for announcements concerning the World Wide Web.
     Server changes, new servers, new software, etc.

As a last resort, you may always mail me.  I'll try to answer as
quickly as I can.

File: w3.info, Node: Future Directions, Next: Programming Interface, Prev: More Help, Up: Top

Future Directions
*****************

Changes are constantly being made to the Emacs browser (hopefully all
for the better).  This is a list of the things that are being worked
on right now.

Fix before 2.3
  1. Imagemap extensions (drag areas)

  2. PATHs

  3. TABLEs

  4. MATHs

  5. DSSSL and DSSSL-Lite Style sheets

Long range goals
  1. Multi-DTD browsing

File: w3.info, Node: Programming Interface, Next: Generalized ZONES, Prev: Future Directions, Up: Top

Internals of Emacs-w3
*********************

This chapter attempts to explain some of the internal workings of
Emacs-w3 and various data structures that are used.  It also details
some functions that are useful for using some of the Emacs-w3
functionality from within your own programs, or extending the current
capabilities of Emacs-w3.

* Menu:

* Generalized ZONES::           A generic interface to 'zones' of text
                                that can contain information.
* Global Variables::            Global variables used throughout
                                Emacs-w3
* Data Structures::             The various data structures used in
                                Emacs-w3
* Miscellaneous Functions::     Miscellaneous functions you can use to
                                interface with w3 and access its data
                                structures
* MIME functions::              MIME functions--parsing messages,
                                mailcap files, and more.

File: w3.info, Node: Generalized ZONES, Next: Global Variables, Prev: Programming Interface, Up: Programming Interface

Generalized ZONES
=================

Due to the many different flavors of Emacs in existence, the addition
of data and font information to arbitrary regions of text has been
generalized.  The following functions are defined for
using/manipulating these "zones" of data.

`w3-add-zone (start end style data &optional highlight)'
     This function creates a zone between buffer positions start and
     end, with font information specified by style, and a data segment
     of data.
     If the optional argument highlight is non-`nil', then the region
     highlights when the mouse moves over it.

`w3-zone-at (point)'
     Returns the zone at POINT.  Preference is given to hypertext
     links, then to form entry areas, then to inlined images.  So if
     an inlined image was part of a hypertext link, this would always
     return the hypertext link.

`w3-zone-data (zone)'
     Returns the zone's data segment.

     The data structures used in Emacs-w3 are relatively simple.  They
     are just list structures that follow a certain format.  The three
     main data types are "form objects", "link objects", and "inlined
     images".  All the information for these types of links is stored
     as lists.

`w3-zone-hidden-p (zone)'
     Returns `t' if and only if a zone is currently invisible.

`w3-hide-zone (start end)'
     Makes a region of text from `start' to `end' invisible.

`w3-unhide-zone (start end)'
     Makes a region of text from `start' to `end' visible again.

`w3-zone-start (zone)'
     Returns an integer that is the start of zone, as a buffer
     position.  In Emacs 18.xx, this returns a marker instead of an
     integer, but it can be used just like an integer.

`w3-zone-end (zone)'
     Returns an integer that is the end of zone, as a buffer position.
     In Emacs 18.xx, this returns a marker instead of an integer, but
     it can be used just like an integer.

`w3-zone-eq (zone1 zone2)'
     Returns `t' if and only if zone1 and zone2 represent the same
     region of text in the same buffer, with the same properties and
     data.

`w3-delete-zone (zone)'
     Removes zone from its buffer (or current buffer).  The return
     value is irrelevant, and varies for each version of Emacs.

`w3-all-zones ()'
     Returns a list of all the zones contained in the current buffer.
     Useful for extracting information about hypertext links or form
     entry areas.  Programs should not rely on this list being sorted,
     as the order varies with each version of Emacs.

`w3-zone-at (pt)'
     This returns the zone at character position PT in the current
     buffer that is either a link or a forms entry area.  Returns
     `nil' if there is no link at point.

These data structures are what is generally returned by
`w3-zone-data'.

File: w3.info, Node: Global Variables, Next: Data Structures, Prev: Generalized ZONES, Up: Programming Interface

Global variables
================

There are also some variables that may be useful if you are writing a
program or function that interacts with Emacs-w3.  All of the
`w3-current-*' variables are local to each buffer.

`url-current-mime-headers'
     An assoc list of all the MIME headers for the current document.
     Keyed on the lowercase MIME header (e.g., `content-type' or
     `content-encoding').

`url-current-server'
     Server that the current document was retrieved from.

`url-current-file'
     Filename of the current document.

`url-current-type'
     A string representing what network protocol was used to retrieve
     the current buffer's document.  Can be one of http, gopher, file,
     ftp, news, or mailto.

`url-current-port'
     Port number of the current document.

`w3-current-last-buffer'
     The last buffer seen before this one.

`w3-running-FSF19'
     This is `t' if and only if we are running in FSF Emacs 19.

`w3-running-epoch'
     This is `t' if and only if we are running in Epoch 4.x.

`w3-running-xemacs'
     This is `t' if and only if we are running in Lucid Emacs,
     WinEmacs, or XEmacs.
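
To give an idea of how the `url-current-*' variables listed above
might be used from your own code, here is a minimal sketch.  The names
`my-w3-format-location' and `my-w3-show-location' are hypothetical and
not part of Emacs-w3.  The sketch assumes that `url-current-file'
begins with a `/' and that the protocol is one that uses the
`type://server' shape, which will not hold for news or mailto.

```elisp
;; Sketch only: both function names below are made up for illustration.

(defun my-w3-format-location (type server port file)
  "Return a URL-like string built from TYPE, SERVER, PORT and FILE.
PORT may be a string, an integer, or nil; nil omits the port."
  (concat type "://" server
          (if port (format ":%s" port) "")
          file))

(defun my-w3-show-location ()
  "Show where the current buffer's document came from, if known."
  (interactive)
  ;; `boundp' guards against running this where the url-current-*
  ;; variables have never been defined (Emacs-w3 not loaded).
  (if (and (boundp 'url-current-type) (boundp 'url-current-server)
           (boundp 'url-current-port) (boundp 'url-current-file))
      (message "%s" (my-w3-format-location url-current-type
                                           url-current-server
                                           url-current-port
                                           url-current-file))
    (message "Emacs-w3 does not appear to be loaded")))
```

Keeping the string building in a separate function that takes plain
arguments means it can be tried out without a document loaded, for
example with `(my-w3-format-location "ftp" "ftp.example.org" nil
"/pub/README")'.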